
Discovering Causal Relationships using Proxy Variables under Unmeasured Confounding

Wu, Yong, Fu, Yanwei, Wang, Shouyan, Wang, Yizhou, Sun, Xinwei

arXiv.org Machine Learning

Inferring causal relationships between variable pairs in observational studies is crucial but challenging, due to the presence of unmeasured confounding. While previous methods employed negative controls to adjust for the confounding bias, they were either restricted to the discrete setting (i.e., all variables are discrete) or relied on strong assumptions for identification. To address these problems, we develop a general nonparametric approach that accommodates both discrete and continuous settings for testing causal hypotheses under unmeasured confounders. By using only a single negative control outcome (NCO), we establish a new identification result based on a newly proposed integral equation that links the outcome and NCO, requiring only completeness and mild regularity conditions. We then propose a kernel-based testing procedure that is more efficient than existing moment-restriction methods, and we derive its asymptotic level and power properties. Furthermore, we examine cases where our procedure using only the NCO fails to achieve identification, and introduce a new procedure that incorporates a negative control exposure (NCE) to restore identifiability. We demonstrate the effectiveness of our approach through extensive simulations and real-world data from an intensive care dataset and the World Values Survey.
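The abstract contrasts a kernel-based testing procedure with moment-restriction methods. As a generic illustration of the kernel-testing flavor (not the authors' proximal procedure), the sketch below runs a standard MMD two-sample permutation test with an RBF kernel; all names and the bandwidth choice are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Generic MMD two-sample permutation test with an RBF kernel.
# This is NOT the paper's proximal testing procedure; it only conveys
# the idea of comparing distributions via kernel mean embeddings.
def rbf_gram(a, b, bw=1.0):
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2 * bw**2))

def mmd2(x, y, bw=1.0):
    # biased squared-MMD estimate
    return (rbf_gram(x, x, bw).mean()
            + rbf_gram(y, y, bw).mean()
            - 2 * rbf_gram(x, y, bw).mean())

def perm_pvalue(x, y, n_perm=200, bw=1.0):
    stat = mmd2(x, y, bw)
    z = np.concatenate([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(z)
        if mmd2(z[:n], z[n:], bw) >= stat:
            count += 1
    return (count + 1) / (n_perm + 1)

x = rng.normal(0.0, 1.0, size=100)
y = rng.normal(1.0, 1.0, size=100)  # mean-shifted alternative
p = perm_pvalue(x, y)
print(p)
```

Under a unit mean shift with 100 samples per group, the permutation p-value should be small, reflecting the power of kernel statistics against smooth alternatives.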


The sequence of distributions that converges weakly to π

Neural Information Processing Systems

We are very grateful to all the reviewers for their thoughtful feedback. All typos and minor points will also be fixed.
Prop. 3 implies that any inference problem can be decomposed into a sequence of …
Another consideration, as highlighted by the example of 4.3, is that reducing the Bayesian computation …, as the two methods have different computational cost patterns. This is required for each optimization step as well.
Currently, however, we haven't found problems where the basis derived from H …
In the discussion after Prop. 1, we should have …
The phrase "lack of precision" in 4.4 refers to the finite number of samples drawn from …


Asymptotic behavior of eigenvalues of large rank perturbations of large random matrices

Afanasiev, Ievgenii, Berlyand, Leonid, Kiyashko, Mariia

arXiv.org Artificial Intelligence

Random Matrix Theory (RMT) is a classical theory that has been developing for more than 70 years. Initially, RMT arose from problems in nuclear physics and has since found applications in mathematics, physics, finance, and many other disciplines. Recently, new problems have been arising from the area of Machine Learning. Indeed, the weight matrices of Deep Neural Networks (DNNs) are often initialized randomly. Moreover, modern DNNs have large weight matrices, which is why their spectral properties can be described by the asymptotic behavior of N × N random matrices as N goes to infinity.
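The asymptotic regime mentioned above can be seen numerically: for a symmetric Wigner matrix with entry variance 1/N, the empirical spectrum converges to the semicircle law supported on [-2, 2]. A minimal sketch (the matrix size and scaling are illustrative choices, not from the text):

```python
import numpy as np

rng = np.random.default_rng(0)

# Wigner matrix: symmetrize a Gaussian matrix so each entry has
# variance ~1/N; its spectrum then concentrates on [-2, 2] as N grows.
N = 1000
A = rng.normal(size=(N, N))
W = (A + A.T) / np.sqrt(2 * N)

eigs = np.linalg.eigvalsh(W)

# Almost all eigenvalues should fall inside the limiting support.
inside = np.mean(np.abs(eigs) <= 2.1)
print(f"fraction of eigenvalues in [-2.1, 2.1]: {inside:.3f}")
```

Plotting a histogram of `eigs` against the density (1/2π)√(4 − x²) makes the semicircle shape visible already at N = 1000.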



Statistical and Topological Properties of Sliced Probability Divergences

Neural Information Processing Systems

We can now prove Theorem 1 (proof of Theorem 1). Next, we prove the other implication, i.e., Theorem 2; our result is thus consistent with the existing results in the literature. We then show that this result holds for two popular choices of kernels, and conclude that the kernel k̂ is positive definite, hence (S17) holds for RBF kernels. (S1.4, proof of Theorem 3:) We start by upper bounding the distance between two regularized measures; the desired result is obtained as a direct application of Theorems 2 and 3 (S1.6).



Control, Optimal Transport and Neural Differential Equations in Supervised Learning

Phung, Minh-Nhat, Tran, Minh-Binh

arXiv.org Artificial Intelligence

From the perspective of control theory, neural differential equations (neural ODEs) have become an important tool for supervised learning. In the fundamental work of Ruiz-Balet and Zuazua (SIAM Review 2023), the authors pose an open problem regarding the connection between control theory, optimal transport theory, and neural differential equations. More precisely, they ask how one can quantify the closeness of the optimal flows in neural transport equations to the true dynamic optimal transport. In this work, we propose a construction of neural differential equations that converge to the true dynamic optimal transport in the limit, providing a significant step toward solving the aforementioned open problem.
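For readers less familiar with the neural-ODE viewpoint, a minimal sketch helps: a neural ODE x′ = f_θ(x) transports inputs along a learned flow, and a forward Euler discretization recovers a residual-network-like map. The vector field, weights, and step count below are hypothetical, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(2)

def f_theta(x, W1, W2):
    # one-hidden-layer vector field with tanh activation (hypothetical weights)
    return W2 @ np.tanh(W1 @ x)

d, hidden, K, T = 2, 8, 50, 1.0
W1 = 0.5 * rng.normal(size=(hidden, d))
W2 = 0.5 * rng.normal(size=(d, hidden))

def flow(x0):
    # forward Euler: x_{k+1} = x_k + h * f_theta(x_k), h = T / K
    h = T / K
    x = x0.copy()
    for _ in range(K):
        x = x + h * f_theta(x, W1, W2)
    return x

x0 = np.array([1.0, -1.0])
xT = flow(x0)
print(xT)
```

Training θ so that the terminal map `flow` pushes a source distribution onto a target is exactly the transport-control perspective the abstract refers to; the discretization step h = T/K controls how closely the discrete map tracks the continuous flow.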


Another look at inference after prediction

Gronsbell, Jessica, Gao, Jianhui, Shi, Yaqi, McCaw, Zachary R., Cheng, David

arXiv.org Machine Learning

Prediction-based (PB) inference is increasingly used in applications where the outcome of interest is difficult to obtain, but its predictors are readily available. Unlike traditional inference, PB inference performs statistical inference using a partially observed outcome and a set of covariates by leveraging a prediction of the outcome generated from a machine learning (ML) model. Motwani and Witten (2023) recently revisited two innovative PB inference approaches for ordinary least squares. They found that the method proposed by Wang et al. (2020) yields a consistent estimator for the association of interest when the ML model perfectly captures the underlying regression function. Conversely, the prediction-powered inference (PPI) method proposed by Angelopoulos et al. (2023) yields valid inference regardless of the model's accuracy. In this paper, we study the statistical efficiency of the PPI estimator. Our analysis reveals that a more efficient estimator, proposed 25 years ago by Chen and Chen (2000), can be obtained by simply adding a weight to the PPI estimator. We also contextualize PB inference with methods from the economics and statistics literature dating back to the 1960s. Our extensive theoretical and numerical analyses indicate that the Chen and Chen (CC) estimator offers a balance between robustness to ML model specification and statistical efficiency, making it the preferred choice for use in practice.
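The "simply adding a weight" idea can be made concrete for the simplest target, a population mean (the paper treats OLS coefficients; the mean case shows the mechanics). The PPI-style estimator corrects the prediction average on unlabeled data by the labeled-sample prediction gap; the CC-style variant weights the correction by a control-variates coefficient. The predictor `f` and the finite-sample simplifications below are illustrative assumptions, not the cited estimators verbatim.

```python
import numpy as np

rng = np.random.default_rng(1)

# Small labeled sample (X_l, Y_l), large unlabeled sample X_u,
# and a deliberately imperfect ML predictor f.
n, N = 200, 20000
beta_true = 2.0

X_l = rng.normal(size=n)
Y_l = beta_true * X_l + rng.normal(size=n)
X_u = rng.normal(size=N)

def f(x):
    # hypothetical, imperfect predictor (wrong slope, small bias)
    return 1.8 * x + 0.1

# PPI-style estimator of E[Y]: prediction mean on unlabeled data,
# debiased by the labeled residual mean.
theta_ppi = f(X_u).mean() + (Y_l - f(X_l)).mean()

# CC-style estimator: weight the correction term by omega chosen to
# minimize variance (a control-variates argument), approximated here by
# Cov(Y, f(X)) / Var(f(X)) on the labeled data, ignoring the n/N factor.
omega = np.cov(Y_l, f(X_l))[0, 1] / np.var(f(X_l), ddof=1)
theta_cc = Y_l.mean() + omega * (f(X_u).mean() - f(X_l).mean())

print(theta_ppi, theta_cc)
```

Both estimators remain consistent for E[Y] even though `f` is misspecified; the weight only changes how much variance the unlabeled data removes, which is the robustness-versus-efficiency balance the abstract highlights.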